ABSTRACT

A frequently used method of measuring teaching
effectiveness is the anonymous student rating of the instructor at
the end of a grading period. The validity of these ratings for
faculty personnel decisions has been a source of controversy. The
purposes of this study were to examine the differences in teaching
effectiveness between selected courses; investigate the effects of
course type and course level on measures of teaching effectiveness,
as well as possible interactions between type and level; and examine
the differences between measures of effectiveness and their
reliability. Data consisted of students' evaluations of undergraduate
and graduate mathematics course instructors. Analysis of variance was
used to compare the mean ratings from 20 mathematics courses in the
study. Results indicated that the particular course, its type, and
its level are important factors to consider when using student
evaluations of teaching performance. (DWH)

The Influence of Course Upon
Measures of Teaching Effectiveness in Mathematics

Research on evaluations of classroom teaching has produced various
measures of teaching effectiveness. Self evaluation and peer evaluation 
have been used occasionally, with less than clear results. A more often 
used method of measuring teaching effectiveness is with anonymous student 
ratings of the instructor at the end of a grading period. 

A controversy exists in the literature concerning the validity of
using students' ratings for faculty personnel decisions. The findings of
Marsh (1982) demonstrated agreement between students and instructors on
evaluations of teaching effectiveness, and support the validity of student
ratings. Dowell and Neal (1982) provide a review of studies which have
attempted to link student ratings to student learning as a way of
validating student ratings as a measure of teaching ability. They found
the validity of student ratings to be quite variable, and at best only
modest. They recommend that student ratings be used with great caution in
the processes of faculty review and decision making. Hills (1974)
concluded that student ratings of faculty could not be trusted when
determining pay increases, promotion, and tenure.

Numerous studies have been conducted to investigate the factors that
may influence or bias students' ratings. Hofman and Kremer (1980) found
student attitude and instructor attitude as perceived by the student to be
important variables in predicting student ratings of the instructor.
Personality characteristics of the instructor have also been shown to
influence evaluations (Braskamp, Ory, and Pieper, 1981; Abrami, Perry, and
Leventhal, 1982). The relationship between grades and instructor ratings
has often been addressed. Several authors (Abrami, Dickens, Perry, and
Leventhal, 1980; DuCette and Kenny, 1982) have found grades to have a
significant effect on ratings, while others (Howard and Maxwell, 1980, 1982)
argue against a grading leniency bias model.

Other studies focus on how student and course variables relate to stu- 
dents' evaluations of teaching. Marsh (1980) examined the relationship 
between student evaluations and certain background characteristics. 
Favorable student ratings were correlated with prior subject interest, 
higher expected grades, higher levels of workload difficulty, and a higher
percentage of students taking the course for general interest only.
Overall and Marsh (1980) investigated the relative contribution of course
level (undergraduate versus graduate), course type (accounting, economics,
etc.), and the specific instructor on students' evaluations. The variance
which could be attributed to the specific instructor was much greater than
that due to course level or course type. Who teaches a course appeared to
be relatively more important than the particular course or the level at
which it is taught.

Greene, Prather, and Sturgeon (1983) have introduced a unique and
unobtrusive measure of teaching effectiveness. It is based on observable
student behavior, and makes use of existing administrative data. This
measure is the number of times students return to a particular teacher for
additional courses. There is evidence that this measure of students'
repeating faculty members can be a valid indicator of teaching
effectiveness. Prather, Massey, and Greene (1983) found the repeat measure
clearly related to students' ratings of instructors in introductory
statistics courses. Students repeating a given faculty member was also
found to be associated with higher students' evaluations of instructors in
mathematics courses (Prather, Massey, Greene, and Sturgeon, 1984).

This study focuses upon several measures of teaching effectiveness in 
mathematics courses. These are seven items of a teaching performance scale
as well as the previously discussed unobtrusive measure of students
repeating an instructor.

The purposes of the study are 1) to examine differences in teaching
effectiveness between selected courses, 2) to investigate the effects of
course type and course level on measures of teaching effectiveness, as well
as possible interactions between type and level, and 3) to look at
differences among the measures of effectiveness and at the reliability of
such measures.

Method 

Data 

The data consist of students' evaluations of undergraduate and
graduate mathematics course instructors for the period 1979 to 1982. A total
of 20 courses, 590 classes, and 9144 evaluations was considered. An
example of the evaluation instrument and the way it is scored can be
found in the Appendix. Both service courses and courses for mathematics
majors were included.

Procedure

Analysis of variance is used to compare the mean ratings from the
20 individual mathematics courses. The independent variables are the
course (Intermediate Algebra, Calculus I, etc.), the course type (service
versus for degree majors), and course level (freshman, sophomore, upper
division, and graduate). Among the dependent variables are scores on the
seven items of a teaching performance scale. The items ask the student if
the instructor: 1) was well-prepared; 2) stimulated student thinking;
3) was actively helpful to students; 4) explained course objectives; 5) was
fair and impartial in grading; 6) explained difficult material; and 7) if
the students felt they learned a great deal. An average of the seven items
is included as well.
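
As a rough illustration of the kind of comparison described above, the
following Python sketch computes a one-way analysis of variance F ratio for
class mean ratings grouped by course. The ratings are invented for
illustration and are not data from this study, and the function name
one_way_anova_f is hypothetical. The sketch covers only the single-factor
case, not the factorial analysis by type and level.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA.

    `groups` is a list of lists; each inner list holds the class mean
    ratings for one course (an assumed data layout).
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Between-groups sum of squares: course means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-groups sum of squares: classes around their own course mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within

# Invented class mean ratings for three hypothetical courses.
ratings_by_course = [
    [4.0, 4.2, 4.4],
    [4.6, 4.8, 5.0],
    [3.6, 3.8, 4.0],
]
f_ratio, df_b, df_w = one_way_anova_f(ratings_by_course)
```

A large F ratio relative to its degrees of freedom indicates that the
course means differ by more than the class-to-class variation within
courses would suggest.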

A "Student Repeats Per Course" variable was calculated by counting the 
number of times each student in a particular class had previously been in a 
class with that same instructor, and dividing this total by the class size. 
For example, if only 2 students in a class of 20 had each had their current 
instructor for one other class, the value of the "Repeats" for that class 
would be .10. The "Repeats" variable is simply the mean of this "Repeats"
measure for all classes of a particular course.
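
The calculation can be sketched in a few lines of Python. This is a
minimal illustration of the definition given above, not the procedure
actually used to process the administrative data; the function names and
the second class are hypothetical.

```python
from statistics import mean

def class_repeats(prior_classes_per_student):
    """Repeats value for one class: prior classes taken with the current
    instructor, summed over students, divided by class size."""
    return sum(prior_classes_per_student) / len(prior_classes_per_student)

def course_repeats(classes):
    """Repeats value for a course: mean of the class-level values."""
    return mean(class_repeats(c) for c in classes)

# The example from the text: a class of 20 in which 2 students each had
# the current instructor for one other class.
example_class = [1, 1] + [0] * 18
example_value = class_repeats(example_class)  # 2 / 20 = 0.10

# A hypothetical course made up of two classes (second class is invented).
other_class = [2, 1, 0, 0, 0, 0, 0, 0, 0, 0]  # 3 / 10 = 0.30
course_value = course_repeats([example_class, other_class])
```

Because each student contributes a count of prior classes rather than a
yes/no flag, the class value can exceed 1 when many students have taken
several earlier classes with the same instructor.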

Results

The means of each item, of the average of the seven items, and of the 
"Repeats" are presented for each course in Table 1. F ratios and levels of 
significance for each dependent variable are also given. 

For all courses combined, the item "Well-Prepared" had the highest
rating (4.54), while the item "Explains Difficult Material" had the lowest
(3.95). Significant differences (p < .01) over the twenty courses were
found for all dependent measures except the items "Course Objectives
Explained" and "Grades Fair and Impartial."

In Table 2 are presented the results of the factorial style analysis
of variance by course type and course level. Courses for degree majors
were rated significantly higher (p < .05) than service courses on all
variables except "Well-Prepared," "Course Objectives Explained," and
"Grades Fair and Impartial." The value of "Repeats" was .23 for service
courses and 1.07 for major courses.

Upper division courses were rated the highest on most measures; they
were not rated highest on "Course Objectives Explained," "Explains
Difficult Material," and "Students Learn Great Deal." The value of
"Average Repeats" for upper division courses was 1.05.

Graduate courses were rated lowest for the following items:
"Well-Prepared," "Course Objectives Explained," "Explains Difficult
Material," and the average of the seven items. "Stimulated Student
Thinking" and "Students Learn Great Deal" were rated lowest for freshman
courses. Significant differences (p < .05) between levels were found for
"Grades Fair and Impartial," "Students Learn Great Deal," and "Average
Repeats." Significant interaction effects were found for "Well-Prepared"
(p < .05) and for the "Repeats" variable (p < .01).

A repeated measures type of analysis of variance was performed using
the seven measures of teaching effectiveness and selected courses having an
N of ten or more classes. An analysis for all twenty courses was also
included. These results are presented in Table 3. Significant differences
between items were found for each selected course as well as for all
courses combined. Reliability coefficients were computed to provide an
estimate of the level of consistency across the seven items. These
reliability coefficients (Cronbach's alpha) ranged from .95 to .96, while
the standardized item alpha coefficients ranged from .95 to .97. This
degree of stability is considered high in terms of measurement applications
(Stanley, 1971).
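
For readers unfamiliar with these coefficients, the sketch below shows the
standard textbook formulas for Cronbach's alpha and the standardized item
alpha in Python. The paper does not say what software or exact procedure
produced the reported values, so this is an assumption-laden illustration
only. The scores are invented, three items stand in for the seven, and all
function names are hypothetical.

```python
from itertools import combinations
from statistics import mean, variance

def cronbach_alpha(items):
    """Cronbach's alpha. `items` is a list of k lists, one per item, each
    holding that item's scores for the same n classes (assumed layout)."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]
    item_var_sum = sum(variance(item) for item in items)
    return (k / (k - 1)) * (1 - item_var_sum / variance(totals))

def pearson_r(x, y):
    """Pearson correlation between two equal-length score lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def standardized_alpha(items):
    """Standardized item alpha, based on the mean inter-item correlation."""
    k = len(items)
    r_bar = mean(pearson_r(x, y) for x, y in combinations(items, 2))
    return k * r_bar / (1 + (k - 1) * r_bar)

# Invented scores: three items rated in four classes.
example_items = [
    [4, 5, 3, 4],
    [4, 4, 3, 5],
    [5, 5, 3, 4],
]
alpha_raw = cronbach_alpha(example_items)
alpha_std = standardized_alpha(example_items)
```

The two coefficients agree closely when the items have similar variances,
which is consistent with the nearly identical ranges reported above.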

Conclusions & Implications

The purpose of this paper has been to investigate the influence of
course and characteristics of the course on measures of teaching
effectiveness. Data on 20 courses, 590 classes, and 9144 evaluations over a
four-year period were used in several analysis of variance procedures.

Differences between courses were found for five of the seven items as
well as for the average of the seven items. These were "Well-Prepared,"
"Stimulated Student Thinking," "Actively Helpful to Students," "Explains
Difficult Material," and "Students Learn Great Deal." As would be expected,
there were differences between courses for the "Repeats" measure, with
higher values observed for upper division courses and for graduate courses
for degree majors.

Differences between types of course were found for four items, the
average of the items, and the "Repeats." Differences between levels were
found for only two items and "Repeats." Repeated measures analysis of
variance yielded relatively high coefficients of reliability as well as
significant differences between the seven items.

Previous research has shown course variables to affect student ratings
of instructor performance in college level courses in general. The results
of this study indicate the importance of taking into account the particular
course, its type, and its level when making use of student evaluations of
teaching performance in mathematics courses.



References

Abrami, P.C., Dickens, W.J., Perry, R.P., & Leventhal, L. (1980). Do
teacher standards for assigning grades affect student evaluations
of instruction? Journal of Educational Psychology, 72, (1), 107-118.

Abrami, P.C., Perry, R.P., & Leventhal, L. (1982). The relationship
between student personality characteristics, teacher ratings, and
student achievement. Journal of Educational Psychology, 74, (1),
111-125.

Braskamp, L.A., Ory, J.C., & Pieper, D.M. (1981). Student written comments:
Dimensions of instructional quality. Journal of Educational
Psychology, 73, (1), 65-70.

Dowell, D.A., & Neal, J.A. (1982). A selective review of the validity of
student ratings of teaching. Journal of Higher Education, 53, (1),
51-62.

DuCette, J., & Kenny, J. (1982). Do grading standards affect student
evaluations of teaching? Some new evidence on an old question. Journal of
Educational Psychology, 74, (3), 308-314.

Greene, J., Jr., Prather, J.E., & Sturgeon, J.S. (1983, May). Using
administrative data as unobtrusive indicators of teaching performance.
Paper presented at The Association for Institutional Research Forum,
Toronto, Ontario, Canada.

Hills, J.R. (1974). On the use of student ratings of faculty in
determination of pay, promotion and tenure. Research in Higher Education,
(4), 317-324.

Hofman, J.E., & Kremer, L. (1980). Attitudes toward higher education and
course evaluation. Journal of Educational Psychology, 72, (5), 610-617.

Howard, G.S., & Maxwell, S.E. (1980). Correlation between student
satisfaction and grades: A case of mistaken causation? Journal of
Educational Psychology, 72, (6), 810-820.

Howard, G.S., & Maxwell, S.E. (1982). Do grades contaminate student
evaluations of instruction? Research in Higher Education, 16, (2),
175-188.

Marsh, H.W. (1980). The influence of student, course, and instructor
characteristics in evaluations of university teaching. American
Educational Research Journal, 17, (1), 219-237.

Marsh, H.W. (1982). Validity of students' evaluations of college teaching:
A multitrait, multimethod analysis. Journal of Educational Psychology,
74, (2), 264-279.

Overall, J.U., & Marsh, H.W. (1980, April). Relative influence of course
level, course type, and instructor on students' evaluation of
instruction. Paper presented at the meeting of the American Educational
Research Association, Boston. (ERIC Document Reproduction Service
No. ED 187 176)

Prather, J.E., Massey, F.A., & Greene, J., Jr. (1983, August). Students'
evaluations of instructors in introductory statistics courses. Paper
presented at the Joint Statistical Meetings, Toronto, Ontario, Canada.

Prather, J.E., Massey, F.A., Greene, J., Jr., & Sturgeon, J.S. (1984,
January). Evaluations by students in mathematics courses of the
effectiveness of teaching. Paper presented at the meeting of the
American Mathematical Society, Louisville, KY.

Stanley, J.C. (1971). Reliability. In R.L. Thorndike (Ed.), Educational
Measurement (pp. 356-442). Washington, DC: American Council on
Education.






